
Java for C and C++ Programmers


Java Data

There are two kinds of Java data: simple data and objects. Simple Java data are boolean, character, integer, and real values. Java objects are encapsulations of various kinds of data components, together with methods for manipulating the components and returning information about them. Methods are like C functions, although the syntax for their use is different.

Simple Java Data

Simple Java data has one of eight types as listed below.

    Type      Contains         Size, Coding, or Values
    boolean   truth value      true, false
    char      character        Unicode characters
    byte      signed integer   8 bit two's complement
    short     signed integer   16 bit two's complement
    int       signed integer   32 bit two's complement
    long      signed integer   64 bit two's complement
    float     real number      32 bit IEEE 754 floating point
    double    real number      64 bit IEEE 754 floating point

The simple types of Java data can be used in much the same way as the corresponding types in C. The syntax for literals and expressions is identical except for some additional escape sequences to handle Unicode characters in character and String literals. Java does not have pointers, structs or unions. Their functionality, and more, is provided by Java objects.

With regard to simple data types, the main difference between C and Java is that Java does not automatically coerce between the boolean type and the integral or character types. This implies that Java integral types and characters cannot be used by themselves as conditions in control statements. For example, for an integer variable x, the following statement is legal in C.

    while (x) {
	...
    }
In Java, the above statement is not legal because the control expression in a while loop must express a boolean value. Thus the loop must be written as follows.
    while (x != 0) {
	...
    }

Java Objects

An object is an encapsulation of data along with methods for manipulating the data. Java objects are grouped into classes. Two objects in the same class contain the same kind of data components and are manipulated by the same set of methods. In Java, classes are regarded as a special kind of object.

There are many Java classes that are defined in the standard class library. In addition, programmers may define their own classes. In fact, almost all Java coding is involved in the definition of classes.

Java Variables and Assignments

Other than literal values, all Java data is accessed through variables, which can be associated with objects (instance variables), classes (class variables), or instantiations of methods (parameters and local variables). All variables are typed as a simple type, or a class, or an interface. An interface is like a class whose methods are declared but not defined. That is, the methods are named, and types are specified for their returned values and parameters, but no code is provided to define their behavior.

Java variables with simple types contain value copies. Although two variables may contain the same value, they will have two distinct copies. If one of the variables is changed, it has no effect on the other. Thus the behavior of simple Java variables is similar to their C counterparts.

Java object variables, on the other hand, are references to objects. Two object variables may refer to the same object. If the object is modified then the change can be seen through both variables.
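The contrast between value copies and references can be sketched as follows. This is only an illustration: the class name CopyVersusReference and its method names are made up for this example, and StringBuilder is used simply because it is a standard library class whose objects can be modified.

```java
// Hypothetical demo class; only StringBuilder comes from the standard library.
class CopyVersusReference {

    // Simple variables hold independent copies of a value.
    static int simpleCopy() {
        int a = 5;
        int b = a;      // b gets its own copy of the value 5
        b = 7;          // changing b has no effect on a
        return a;
    }

    // Object variables hold references, so two variables can share one object.
    static String sharedObject() {
        StringBuilder first = new StringBuilder("abc");
        StringBuilder second = first;   // both variables refer to the same object
        second.append("def");           // modify the object through one variable
        return first.toString();        // the change is visible through the other
    }

    public static void main(String[] args) {
        System.out.println(simpleCopy());      // prints 5
        System.out.println(sharedObject());    // prints abcdef
    }
}
```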

Modifying Java Variables

The only mechanism for changing the value of a simple Java variable is an assignment statement. Java assignment syntax is identical to C assignment syntax. As in C, an assignment replaces the value of a variable named on the left-hand side of the equals sign by the value of the expression on the right-hand side of the equals sign.

Java object variables can be changed in two ways. Like simple variables, you can make assignments to object variables. When this is done the object referenced by the variable is not changed. Instead, the reference is replaced by a reference to a different object.

With a few exceptions, the only other thing that you can do with an object variable is to send it a message. This is an important part of any Java program, allowing communication between objects.

Java Messages

The syntax of a Java message closely resembles a C expression that accesses a member of a struct. Its form is one of the following:

    receiver.method-name(parameter list)
or
    receiver.variable-name

In both forms, receiver is an expression that denotes the receiver of the message. This expression can be a variable or class name, or an indexed array expression, or any complex Java expression that has an object as its returned value. The legality of both forms depends on access conditions, described in the section Access Conditions in Java below.

In the first form, method-name is the name of the method to apply, and parameter list is a possibly empty comma separated list of parameters. Except for the receiver and the period at the front, the first message form has the same syntax as a C function call. Like C function calls, the expression can have effects and can return data. Normally, the direct effects are limited to changes in the receiving object.

In the second form, variable-name is the name of a variable attached to the receiver. The expression returns the value or reference associated with that variable.

For example, consider the following statement, where x is the name of an object variable.

    System.out.println(x.getClass().getName());
Here, System is the name of a standard library class. This class has an attached variable, out, which is the standard output stream for a program, much like stdout in C programs. This variable is an object of class PrintStream. The PrintStream class defines the println() method, which prints its argument followed by a newline. The argument should be an object from the String class.

The getClass() method is defined for all objects, returning the class (an object of class Class) to which the object belongs. The Class class defines the getName() method, which returns the name of the class (an object of class String). Thus the above statement prints the name of the class to which x belongs. With the possible exception of the class of x, all of the classes in this example are Java standard library classes.
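As a small runnable sketch of this chain of messages, the following wraps the expression in a helper method. The names ClassNameDemo and classNameOf are hypothetical; getClass() and getName() are the standard methods described above.

```java
// Hypothetical demo class showing a chain of messages.
class ClassNameDemo {

    // Sends getClass() to x, then sends getName() to the returned Class object.
    static String classNameOf(Object x) {
        return x.getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println(classNameOf("hello"));               // prints java.lang.String
        System.out.println(classNameOf(new StringBuilder()));   // prints java.lang.StringBuilder
    }
}
```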

Creating Java Objects

Most Java objects are created using the keyword new with a call to a constructor method. If a class defines no constructors, the compiler supplies a default parameterless constructor that does nothing beyond calling the parameterless constructor of its superclass. Alternatively, specialized constructors can be defined in the class.

Java constructor definitions are similar to method definitions except for two things: the name of the constructor must be the class name and the type of the returned value is omitted. Most constructors are declared public, so their definition has the form

    public class-name(typed parameter list) {
        object initialization code
    }
The typed parameter list has the same syntax as the parameter list in a C function definition.

A constructor is usually called in a new expression, which has the form

    new class-name(parameter list)
This expression returns a new instance of the class named by class-name. The expression can be used in any context where an object of this class is legal. For example it could be the right-hand side of an assignment statement to a variable of the appropriate type, or it could be used as a parameter in a method call.

There is one context where a constructor is called without a new expression: inside another constructor, either of the same class or of a subclass. In this case, the initialization code of the called constructor is executed without creating a new object. Two special syntax forms are used to do this, each of which must be the first statement in the calling constructor:

    this(parameter list);
and
    super(parameter list);
The first form performs the initialization code in the constructor whose parameter types match the actual parameters (usually a different constructor). The second form performs the initialization code in a constructor defined in the superclass.

Often, there is one primary constructor for a class, with all of the necessary parameters for constructing an object, and one or more secondary constructors that omit some of the parameters. The initialization code for the secondary constructors is just a call to the primary constructor with default values provided for the omitted parameters. For example, in the String standard library class, there is a primary constructor with a String parameter that returns a copy of the parameter. There is also a secondary constructor that returns an empty String. The code for the secondary constructor is

    public String() {
	this("");
    }
The this statement just calls the primary constructor, providing an empty String as a default value.
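The same primary and secondary constructor pattern might look like this in a programmer-defined class. The Interval class, its variables, and the default bounds of 0 and 1 are all assumptions invented for the sketch.

```java
// Hypothetical class illustrating a primary and a secondary constructor.
class Interval {
    int low;
    int high;

    // Primary constructor: takes every value needed to build the object.
    Interval(int lo, int hi) {
        low = lo;
        high = hi;
    }

    // Secondary constructor: supplies default values for the omitted parameters.
    Interval() {
        this(0, 1);     // runs the primary constructor's initialization code
    }

    int length() {
        return high - low;
    }

    public static void main(String[] args) {
        System.out.println(new Interval(3, 10).length());   // prints 7
        System.out.println(new Interval().length());        // prints 1
    }
}
```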

When designing the constructors for a class, data integrity is an important consideration. All public constructors should create objects that satisfy data integrity constraints. A class that defines no constructors has a default parameterless constructor that only does the initialization specified in the parameterless constructor for the superclass. If that does not create an object that satisfies data integrity constraints then a parameterless constructor should be defined explicitly for the class, replacing the default one.

Sometimes there is no reasonable way of defining a parameterless constructor. In that case, the default constructor should be declared as either private or protected. The latter is sometimes useful for allowing subclasses to call the default constructor in their initialization code.

Defining Java Classes

In its simplest form, a Java class definition has the following structure.

    public class A {
	
	object and class member definitions

    }
The object and class member definitions are method (including constructor), variable, and constant definitions, which are like C function, variable, and constant definitions with the exception of some keyword modifiers as described below.

There is one important difference between Java member definitions and their C counterparts: Java does not require that class members be defined prior to their use. A Java compiler works like an assembler in that it makes multiple passes through a code file. The early passes are just recording symbols and types associated with them.
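A minimal sketch of this freedom follows. The class ForwardUse and its members are hypothetical; the point is only that area() refers to a method and a constant that are defined further down in the class, with no prototypes or forward declarations.

```java
// Hypothetical class whose first method uses members defined after it.
class ForwardUse {

    // Uses PI_APPROX and square() before their definitions appear.
    static double area(double radius) {
        return PI_APPROX * square(radius);
    }

    static double square(double v) {
        return v * v;
    }

    // A constant defined at the end of the class, after its use in area().
    static final double PI_APPROX = 3.14159;

    public static void main(String[] args) {
        System.out.println(area(2.0));
    }
}
```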

There is a price to pay for this added freedom: it complicates error handling by the compiler. Error messages from the compiler can be frustrating because as you fix errors, new errors are uncovered. It is not uncommon to get an early report of a small number of errors and a large number of errors after the first errors are fixed.

Instance and Class Members

In Java, members may be associated with either objects or classes. Members that are associated with objects are usually called instance members. In a class definition, all members are instance members except those that are qualified by the static keyword. In a message that accesses an instance member, the receiver is specified by a variable or expression that references an object. In a message that accesses a class member, the receiver is specified by the name of a class.

Instance and Class Variables

The most important variables that are defined in a class are instance variables. Instance variables are attached to objects so that each object in a class has its own instance variables. All variable declarations are declarations of instance variables unless qualified by the static keyword.

The static keyword indicates that the variable is a class variable. A class variable is attached to the class in which its definition appears so that its value or reference is shared by all objects in the class. The following declaration declares count to be a class variable of type int.

    static int count;

If this declaration appears in class A then the variable is accessed with the message A.count. This message returns the value of the variable, which may be part of a more complex expression, such as

    System.out.println(A.count);

Instance and Class Methods

Like variables, methods are instance methods except when declared as class methods using the keyword static. If class A defines method f() as static then messages using the method must be sent to the class as in

    x = A.f();
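The difference between instance and class members can be sketched with a class that counts its own instances. The Counter class and all of its member names are assumptions made up for this example.

```java
// Hypothetical class mixing class (static) members and instance members.
class Counter {
    static int count;   // class variable: one copy shared by all Counter objects
    int id;             // instance variable: each object has its own copy

    Counter() {
        count = count + 1;   // update the shared class variable
        id = count;          // record a value that belongs to this object only
    }

    // Class method: the receiver is the class name, as in Counter.howMany().
    static int howMany() {
        return count;
    }

    // Instance method: the receiver is an object, as in c.getId().
    int getId() {
        return id;
    }

    public static void main(String[] args) {
        Counter a = new Counter();
        Counter b = new Counter();
        System.out.println(Counter.howMany());             // prints 2
        System.out.println(a.getId() + " " + b.getId());   // prints 1 2
    }
}
```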

Constants: final Variables

Unlike C, Java uses the keyword final to declare constants. Most constants are also class variables, so they are usually declared like

    static final int taxRate;

Access Conditions in Java

Java uses the keywords public, protected, and private to modify access to members in a class definition. These keywords determine the kinds of scope from which a member can be accessed. There is also a default scope for members that are defined without an access keyword.

Java Scopes

A scope is a limited portion of source code that provides a context for interpreting identifiers or names. An identifier can refer to different things depending on the scope in which it appears. This is one of the important mechanisms for encapsulation.

As in C, the scope for a local variable or parameter is the method or innermost block in which it is declared. A block scope is either a control statement or a segment of code delimited by braces. For example, the scope for the local variable i in the following code is the entire for statement. The scope of the local variable x is the region enclosed by the braces.

    for (int i = 0; i < 100; i++) {
        double x;
        .
        .
    }

There are two scopes in Java that are not available in C: class scopes and package scopes. A class scope consists of an entire class definition. A package scope is a directory containing Java source code files. To be used in a package scope, a source code file should contain a statement with the following form.

    package package-name;

Here, package-name is the name of the package. Rules for naming class source code files and packages are described in the section Naming Java Source Code Files and Packages.

Limiting Access for Members

Access to a member of a Java class can be limited to a scope using one of the keywords public, protected, or private, or to a default scope if none of these keywords is used. If an access keyword is used, it should precede the type for the member.

If an instance variable or method of class A is declared with the public keyword, it can be accessed through any reference to an object from class A. If an instance variable or method of class A is declared with the protected keyword, it can be accessed only within the package that contains A or in the class definition of a subclass of A. If an instance variable or method of class A is declared with the private keyword, it can be accessed only in the class definition of A. If an instance variable or method of class A is declared without one of these keywords, it can be accessed only within the package that contains A.
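A single-file sketch can show only the private and public cases; the protected and default cases depend on packages and subclasses spread over several files. The Account and AccessDemo classes below are hypothetical.

```java
// Hypothetical class with a private variable guarded by public methods.
class Account {
    private int balance;    // accessible only inside the definition of Account

    public void deposit(int amount) {
        if (amount > 0) {               // the class enforces its own integrity rule
            balance = balance + amount;
        }
    }

    public int getBalance() {
        return balance;
    }
}

// A separate class: it can use only the public members of Account.
class AccessDemo {
    static int run() {
        Account acct = new Account();
        acct.deposit(50);
        acct.deposit(-20);      // rejected by deposit(), so the balance is unchanged
        // acct.balance = -1;   // would not compile: balance is private to Account
        return acct.getBalance();
    }

    public static void main(String[] args) {
        System.out.println(run());      // prints 50
    }
}
```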

The Context for Writing Java Methods

When you are writing an instance method in a class definition, you have access to all of the instance variables for the object that receives a message with that method, along with all of the class variables for the class. In addition, you can send a message to the receiving object or its class without specifying the receiver. That is, a message from an object to itself or to its class looks just like an ordinary C function call or variable reference:

    method-name(parameter list)
or
    variable-name
Messages to all other objects or classes must specify the receiver using the variable or class name, followed by a period, followed by the function call or variable name, as shown earlier.

Dealing with null variables

There is a special value, null, that can be assigned to any object or array variable. It indicates that there is no object or array referenced by the variable. For example, a null value can be used for lists to indicate that the list is empty.

If a program attempts to send a message to a variable whose current value is null then the Java runtime system will throw a NullPointerException. This exception is also thrown when a program attempts to access entries of an array variable when the current value of the variable is null.

It would be nice if you could encapsulate handling of null values in a class that defines lists or other linked structures. Unfortunately, you cannot do this with a single list class for two reasons.

First, the null value does not reference a special object. It just indicates that there is no object referenced by the variable. So you cannot send it a message. This requires that clients of the list class use code to test for null variables and act appropriately. The list class must provide most of the functionality that you need, but it must leave handling the empty list case to the client.

The second reason is that a method for the list class cannot change the reference for a variable that is the receiver of a list message. Suppose you are writing a method that removes an entry from a list. To handle the case where the list only contains one entry, you need to make the list variable null. But this cannot be done within the remove method. The method has access to the receiver object, but not the variable that references it. Thus the sender of the message must set the variable to null.

This kind of behavior is unfortunate, and it can lead to headaches. Higher-level classes should attempt to encapsulate this problem. To assist the higher-level classes, the lower-level class can return the value that should be assigned by the higher-level class. The higher-level classes must then assign the null value to the desired variable.
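One way this division of labor might look is sketched below. Node plays the lower-level class whose remove method returns the reference that should be stored, and IntList plays the higher-level class that performs the assignment and hides the null test from clients. Both classes and all their member names are assumptions invented for this example.

```java
// Lower-level class (hypothetical): a node in a singly linked list of ints.
class Node {
    int value;
    Node next;      // null when this is the last node

    Node(int v, Node n) {
        value = v;
        next = n;
    }

    // Removes the first node holding target from the list that starts at this
    // node. Returns the reference the sender should store as the new start of
    // the list, which is null if the list becomes empty.
    Node remove(int target) {
        if (value == target) {
            return next;
        }
        if (next != null) {
            next = next.remove(target);
        }
        return this;    // the receiver is still the start of the list
    }
}

// Higher-level class (hypothetical): encapsulates the null handling.
class IntList {
    private Node head;      // null means the list is empty

    void addFront(int v) {
        head = new Node(v, head);
    }

    void remove(int target) {
        if (head != null) {                 // cannot send a message to null
            head = head.remove(target);     // assign the value returned by Node
        }
    }

    int size() {
        int n = 0;
        for (Node p = head; p != null; p = p.next) {
            n++;
        }
        return n;
    }
}
```

Clients of IntList never see a null value: removing from an empty list or removing the last entry is handled entirely inside the two classes.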

Java Inheritance

When a Java class is defined, it can be defined as an extension of another class using the keyword extends. For example, the following class definition declares that class B extends the definition of class A.

    public class B
    extends A {
	
	instance and class member definitions

    }
This means that all public and protected variables and methods of class A are automatically defined for class B as well. The variables and methods do not need to be defined again in class B unless they need to be changed. In standard object-oriented terminology, we say that B inherits its variables and methods from A. A Java class can only inherit from a single parent class, which is called its superclass. If the extends clause is not used in a class definition then its superclass is the class called Object.
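A small sketch of inheritance follows, using hypothetical Animal and Dog classes. Dog inherits getName() unchanged and redefines describe() because it needs different behavior.

```java
// Hypothetical superclass.
class Animal {
    protected String name;      // visible in subclasses such as Dog

    Animal(String n) {
        name = n;
    }

    public String getName() {
        return name;
    }

    public String describe() {
        return name + " is an animal";
    }
}

// Hypothetical subclass: inherits name and getName(), redefines describe().
class Dog extends Animal {

    Dog(String n) {
        super(n);       // run the superclass constructor's initialization code
    }

    public String describe() {
        return name + " is a dog";
    }

    public static void main(String[] args) {
        Dog d = new Dog("Rex");
        System.out.println(d.getName());    // prints Rex
        System.out.println(d.describe());   // prints Rex is a dog
    }
}
```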

Abstract Classes

Sometimes, it is convenient to define a class that can have no instances. For example, a class can provide implementation for methods that are common to two or more classes, but leave implementation of other methods to subclasses. If the parent class declares the unimplemented methods then it uses the abstract keyword in the method declaration and omits the method body. When this is done, the class is called an abstract class. The abstract keyword should also be added to the class definition as in the following.

    public abstract class A {

	member definitions
	
	public abstract boolean func(int n);

	more member definitions

    }

An abstract class is, by itself, useless, but it defines a common interface for its subclasses. Subclasses that define all of the abstract methods are called concrete subclasses. For example, to define class B as a concrete subclass of class A, you use a class definition like the one below.

    public class B
    extends A {

	new member definitions
	
	public boolean func(int n) {
	    implementation code
	}

	more new member definitions

    }

The power of abstract classes is that the class can be used as a type for variables and parameters. Objects from any concrete subclass can be assigned to the variables or passed through the parameters. This capability can be used to make code more immune to changes. For example, an abstract class could be used to define a common interface for several implementations of an abstract data type. If the abstract class is used to type variables and parameters then the only place where the code needs to be changed when the implementation is changed is in calls to constructors.
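As an illustration of typing variables and parameters with an abstract class, consider the following sketch. Shape, Square, Rectangle, and ShapeDemo are hypothetical names; only the constructor calls mention the concrete classes.

```java
// Hypothetical abstract class: declares area() and implements largerThan().
abstract class Shape {
    public abstract double area();

    // Common code written once, in terms of the abstract method.
    public boolean largerThan(Shape other) {
        return area() > other.area();
    }
}

class Square extends Shape {
    private double side;

    Square(double s) {
        side = s;
    }

    public double area() {
        return side * side;
    }
}

class Rectangle extends Shape {
    private double width;
    private double height;

    Rectangle(double w, double h) {
        width = w;
        height = h;
    }

    public double area() {
        return width * height;
    }
}

class ShapeDemo {
    // The parameter is typed with the abstract class, so any concrete
    // subclass can be passed in.
    static double totalArea(Shape[] shapes) {
        double total = 0.0;
        for (int i = 0; i < shapes.length; i++) {
            total = total + shapes[i].area();
        }
        return total;
    }

    public static void main(String[] args) {
        Shape[] shapes = { new Square(3.0), new Rectangle(2.0, 4.0) };
        System.out.println(totalArea(shapes));                  // prints 17.0
        System.out.println(shapes[0].largerThan(shapes[1]));    // prints true
    }
}
```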

Java Interfaces

A Java interface is similar to an abstract class in that it has undefined methods. In fact, all of the methods in an interface must be given with declarations but no definition. That is, only method prototypes appear in an interface definition:

    public interface C {

	public boolean func(int n);

	more method declarations

    }

The only variables that can be declared in an interface definition are final variables (constants). Constants are almost always declared to be static.

Java inheritance rules allow a class to have a single superclass, but it can implement any number of interfaces. The following form is used to define a class E with superclass A that implements two interfaces C and D.

    public class E
    extends A
    implements C, D {

	definitions of members declared in A, C, and D

	new member definitions

    }

If some of the members that are declared by A, C, or D are not defined in E, then E should be declared as abstract.
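A concrete version of this form might look like the sketch below, where the hypothetical class Book extends Item and implements two interfaces, Named and Priced. All of these names are assumptions made up for the example.

```java
// Two hypothetical interfaces: method declarations only, no bodies.
interface Named {
    public String getName();
}

interface Priced {
    public int getPriceInCents();
}

// Hypothetical superclass.
class Item {
    protected int id;

    Item(int i) {
        id = i;
    }

    public int getId() {
        return id;
    }
}

// One superclass, two interfaces; every declared method is defined here.
class Book extends Item implements Named, Priced {
    private String title;
    private int cents;

    Book(int i, String t, int c) {
        super(i);
        title = t;
        cents = c;
    }

    public String getName() {
        return title;
    }

    public int getPriceInCents() {
        return cents;
    }
}

class InterfaceDemo {
    // Parameters typed by an interface accept any implementing class.
    static String label(Named n) {
        return "Name: " + n.getName();
    }

    static int total(Priced a, Priced b) {
        return a.getPriceInCents() + b.getPriceInCents();
    }

    public static void main(String[] args) {
        Book b = new Book(1, "Java Notes", 1250);
        System.out.println(label(b));       // prints Name: Java Notes
        System.out.println(total(b, b));    // prints 2500
    }
}
```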

Java abstract classes and interfaces serve similar purposes. However, in a particular situation, one will usually be better suited than the other. The tradeoffs between the two are beyond the scope of this web page. For a good advanced discussion of the issues see CM96.

Java Arrays and Strings

There are two kinds of Java objects that receive special syntactic treatment: arrays and strings. In Java, arrays are objects. However, you can use an indexing notation with arrays that is similar to C array indexing. To declare an array of int, for example, you use a declaration like the following.

    int[] A = new int[5];
Then individual entries can be accessed using a notation like A[i].
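A short sketch of array creation and indexing follows. ArrayDemo and sumOfSquares are hypothetical names; the length variable used in the loops is part of every Java array.

```java
// Hypothetical demo of array creation, indexing, and the length variable.
class ArrayDemo {

    // Fills an array with the squares 0, 1, 4, ... and returns their sum.
    static int sumOfSquares(int n) {
        int[] a = new int[n];       // entries of a new int array start at 0
        for (int i = 0; i < a.length; i++) {
            a[i] = i * i;
        }
        int sum = 0;
        for (int i = 0; i < a.length; i++) {
            sum = sum + a[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(5));    // prints 30
    }
}
```

Unlike C, an index outside the range 0 to a.length - 1 does not silently access other memory; the Java runtime throws an ArrayIndexOutOfBoundsException.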

In Java, character strings are not usually handled as arrays of characters. Instead, there is a standard library class String for dealing with them. Although you cannot use array notation with objects of class String, Java provides three facilities that make it easier to work with them. First, there is a toString() method that is defined for all objects. This method is redefined in many standard library subclasses to return a String description appropriate to the class. Programmers can redefine toString() in their own classes to return whatever description they want.

Second, in an expression, such as a println() argument, where a String is expected, Java will automatically convert data to strings. This is done by calling toString() for objects, and using built-in conversion routines for simple data types.

Finally, Java uses + to indicate concatenation of String objects. If x has value 5 then the following code prints out the text line "The value of x is 5.".

    System.out.println("The value of x is " + x + ".");
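The three facilities can be seen working together in the following sketch. The Point class and the ToStringDemo class are hypothetical; the redefined toString() is called automatically when a Point appears in a concatenation.

```java
// Hypothetical class that redefines toString().
class Point {
    int x;
    int y;

    Point(int px, int py) {
        x = px;
        y = py;
    }

    public String toString() {
        return "(" + x + ", " + y + ")";    // ints are converted automatically
    }
}

class ToStringDemo {
    // Concatenating a Point with a String calls the Point's toString().
    static String describe(Point p) {
        return "The point is " + p + ".";
    }

    public static void main(String[] args) {
        System.out.println(describe(new Point(3, 4)));  // prints The point is (3, 4).
    }
}
```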

Java Generic Data Structures and Coercion

The Java standard class library contains a few useful generic data structure classes, such as Hashtable, Vector, and Stack, and interfaces such as Enumeration. These classes and interfaces are designed to deal with collections of objects. The objects that they hold are declared as class Object so that these structures can be used in a wide range of contexts. However, this creates a problem when you retrieve data from one of these collections: you cannot directly assign the returned value to a variable unless it is declared to have type Object. But then you cannot send any messages to the variable except those that are defined for class Object.

You will normally want to be able to send any message defined for the actual class of the object. In order to do this, you need to coerce the returned value to the expected class. For example, suppose you have an Enumeration variable named e that contains String objects. (The name enum cannot be used for a variable, since it is a reserved word in current versions of Java.) You can get one of the objects with the nextElement() method, but the declared type of the returned value for this method is class Object. In order to deal with it as a String, you need to use a coercion expression, as in the following.

    (String)(e.nextElement())

Then the value of this expression can be assigned to a String variable or passed as an argument to the println() method.
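A complete sketch of retrieving and coercing elements might look like this. CoercionDemo and totalLength are hypothetical names. The Vector and Enumeration are written with an explicit <Object> type argument, which is an assumption made here so that the example compiles without warnings on current compilers; it simply states that the collection holds its elements as class Object, as described above.

```java
import java.util.Enumeration;
import java.util.Vector;

// Hypothetical demo of coercing values retrieved from a collection.
class CoercionDemo {

    // Adds up the lengths of the strings stored in a Vector of Objects.
    static int totalLength() {
        Vector<Object> words = new Vector<Object>();
        words.addElement("alpha");
        words.addElement("beta");

        int total = 0;
        Enumeration<Object> e = words.elements();
        while (e.hasMoreElements()) {
            // nextElement() is declared to return Object, so the value must
            // be coerced before String messages such as length() can be sent.
            String s = (String)(e.nextElement());
            total = total + s.length();
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(totalLength());      // prints 9
    }
}
```

If an element turns out not to be a String, the coercion fails at run time with a ClassCastException.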

In general, coercion is accomplished by putting a parenthesized type name in front of an expression. Although parentheses around the expression itself are not always needed, using them avoids possible misinterpretations.

Java Standard Library Classes

Complete documentation for Java standard library classes can be found in the Java 2 Platform API Specification. The following packages are the most commonly used. If a source code file refers to anything from standard library classes other than those in the java.lang package then the file must have an import statement.

Naming Java Source Code Files and Packages

With some exceptions that are beyond the scope of this web page, each Java class is defined in a separate source code file. The name of the file should be the same as the class name with a .java suffix. Thus the definition for a class named Comparators should be in a file named Comparators.java. If the operating system uses case-sensitive file names then the .java suffix should be all lower case and the case of the rest of the file name should match the case of the class name.

It is a good programming practice to group source code files into packages of related classes, with each package in a separate directory. For compatibility with Java development software, the directories should be arranged as one or more hierarchies, each with the name classes. To enable development software to access the packages, you need to set a CLASSPATH environment variable in your login startup script (the .personal file for UNIX machines here at UMD). For example, I have most of my classes rooted in my subdirectory public/lib. To allow development software to access the classes, I have added the following line to my .personal file:

    setenv CLASSPATH .:$HOME/myjava/lib
In general, the CLASSPATH variable is set to a list of directories separated by colons. In the above command, the list contains two directories: my classes directory and the current directory (using the UNIX period abbreviation). The current directory is added to allow access to classes in the current directory that do not have package statements.

For import statements, described in the next section, it is important to know how packages are named in Java. By way of example, my classes directory contains a subdirectory named gshute/util. Each class in this subdirectory has the following package statement:

    package gshute.util;
that is, the package name is the path name of the subdirectory, relative to my CLASSPATH variable, with periods replacing the slashes.

Import Statements

When you are writing code for a class, it will often play the role of a client for another class, which plays the role of a server class. In Java, the server class and any classes appearing in its interface must be imported by the client. This includes the server class itself and any classes used to define method parameter types or returned value types for server methods. Classes that are part of the package that contains the client class or part of the java.lang standard library package do not need import statements.

Server classes are imported with import statements. These statements should appear near the beginning of a Java source code file, immediately after the package statement. An import statement has the following syntax:

    import fully-qualified-class-name;
A fully-qualified-class-name has the form
    package-name.class-name

For example, if I want to import my Comparators class, which is defined in the gshute/util subdirectory then I use the following import statement.

    import gshute.util.Comparators;

It is a common practice to import a complete package. This can be done with the wild card "*" replacing the class name. For example, the following import imports the entire java.util package.

    import java.util.*;

Java Source Code File Structure

A Java source code file has the following structure:

    package statement

    import statements

    class definition
If the name of the class defined in the file is A then the file must be named A.java.

Compiling and Executing Programs in Java

The javac program is used to compile Java classes. To compile the class A, which should be defined in the file A.java, you need to give the following command.

    javac A.java

In order to be executable, the class must contain a main() method that is declared as follows.

    public static void main(String[] args) {
	main program code
    }
Then the class can be executed with the command
    java A
If the program uses command-line arguments they can be added after the class name. The command-line arguments are accessible in the main() method through its args parameter. The number of command line arguments is args.length.
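As a sketch of an executable class that uses its command-line arguments, consider the hypothetical Echo class below, assumed to be saved in a file named Echo.java. The join helper method is also made up for this example.

```java
// Hypothetical executable class that reports its command-line arguments.
class Echo {

    // Joins the arguments into one String, separated by single spaces.
    static String join(String[] args) {
        String result = "";
        for (int i = 0; i < args.length; i++) {
            if (i > 0) {
                result = result + " ";
            }
            result = result + args[i];
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(args.length + " argument(s): " + join(args));
    }
}
```

After compiling with javac Echo.java, the command java Echo one two three should print a line reporting three arguments followed by the arguments themselves. Unlike argv in C, the args array does not include the program name.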

Complete documentation for the javac and java programs and other tools for Java software development can be found in JDK Basic Tools.